Performance-Driven Architectural Synthesis for Distributed Register-File Microarchitecture with Inter-Island Delay

Authors

  • Juinn-Dar Huang
  • Chia-I Chen
  • Wan-Ling Hsu
  • Yen-Ting Lin
  • Jing-Yang Jou
Abstract

In the deep-submicron era, wire delay is becoming a bottleneck in the pursuit of higher system clock speeds. Several distributed register (DR) architectures have been proposed to cope with this problem by keeping most wires local. In this article, we propose the distributed register-file microarchitecture with inter-island delay (DRFM-IID). Although DRFM-IID is also a DR-based architecture, its delay model makes it more practical than the previously proposed DRFM. With inter-island delay taken into account, the synthesis task is inherently more complicated than it is without it, since uncertain interconnect latency is very likely to seriously affect overall system performance. Therefore, we also develop a performance-driven architectural synthesis framework targeting DRFM-IID. Several factors for evaluating the quality of results, such as the number of inter-island transfers, the timing criticality of transfers, and resource utilization balancing, are used to guide architectural synthesis toward better optimization outcomes. The experimental results show that the latency and the number of inter-cluster transfers can be reduced by 26.9% and 37.5% on average, respectively; the latter is commonly regarded as an indicator of the power consumption of on-chip communication.

Key words: behavioral synthesis, distributed register-file, performance optimization, low-power, resource binding, scheduling

Similar resources

Communication Synthesis for Interconnect Minimization Targeting Distributed Register-File Microarchitecture

In deep-submicron era, wire delay is becoming a bottleneck while pursuing even higher system clock speed. Several distributed register (DR) architectures have been proposed to cope with this problem by keeping most wires local. In this article, we propose a new resource-constrained communication synthesis algorithm for optimizing both inter-island connections (IICs) and latency targeting on distr...

A Hierarchical Criticality-Aware Architectural Synthesis Framework for Multicycle Communication

In deep submicron era, wire delay is no longer negligible and is becoming a dominant factor of the system performance. To cope with the increasing wire delay, several state-of-the-art architectural synthesis flows have been proposed for the distributed register architectures by enabling on-chip multicycle communication. In this article, we present a new performance-driven criticality-aware synt...

Lazy Retirement: A Power Aware Register Management Mechanism

In this paper we describe "Lazy Retirement", a power-aware improvement to Intel’s P6 family microarchitecture. Lazy Retirement significantly reduces the energy and power involved in register retirement. Lazy Retirement delays the copy from the physical register file (ROB) to the architectural (real) register file (RRF) until it has no choice and the physical register has to be re-used. In man...

Scalable Distributed Register File

In microarchitectural design, conceptual simplicity does not always lead to reduced technological complexity. VLSI design offers several standard structures which get very inefficient when they are scaled up. For instance, the superscalar OOO processing model is conceptually simple – with the control-flow-oriented front-end and the dataflow-oriented backend – but simply scaling the structures in...

Design Space Exploration to Find the Optimum Cache and Register File Size for Embedded Applications

In the future, embedded processors must handle more computation-intensive network applications and internet traffic, and packet-processing tasks will become heavier and more sophisticated. Since processor performance is strongly related to the average memory access delay and is also affected by the number of processor registers, the cache and register file are two major parts in designing embedd...


Journal title:
  • IEICE Transactions

Volume 95-A  Issue

Pages  -

Publication date 2012